Method and apparatuses for improving quality of digitally encoded speech in the presence of interference

ABSTRACT

Methods and apparatuses for encoding speech signals in the presence of interference that accurately establish a speech signal value subsequent to lost transmission packets. For one embodiment of the present invention, the initial bits of a speech transmission packet are encoded using a PCM encoding scheme and the remaining bits are encoded using a CVSD encoding scheme. For the initial bits of each packet, the instantaneous value of the voltage as derived from the CVSD coder/decoder at the transmitter is encoded using PCM coding rather than CVSD coding.
At the receiver, each packet is decoded independently using the PCM-encoded bits, rather than the terminal value of a preceding packet, to define a starting value. The PCM-encoded bits of a valid packet are used to reestablish the signal value, thus avoiding packet-to-packet error in the presence of burst interference.

FIELD

[0001] Embodiments of the invention relate generally to the field of packet-switched voice transmission systems, and more particularly to an improved method of encoding speech signals for systems subject to burst errors.

BACKGROUND

[0002] Of the available speech digitization techniques, one of the more popular is the waveform follower technique that attempts to emulate the speech waveform. Although the waveform follower technique requires more transmission bandwidth than other techniques, it has been preferred due to its simple implementation architecture and low processing and power requirements. Two types of waveform follower speech signal encoding techniques are pulse code modulation (PCM) and continuously variable slope delta modulation (CVSD). PCM, which is basically a quantized pulse amplitude modulation, obtains an adequate representation of an analog signal by sampling the signal and encoding each sample as an approximation to one of several allowable discrete values. A typical PCM technique samples the analog speech signal 8000 times per second. Each sample is represented by 8 bits for a total bit rate requirement of 64 Kbps.
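The PCM figures quoted above can be checked with a short sketch. It assumes a uniform 8-bit quantizer over a normalized range of [-1.0, 1.0); the text does not specify the quantizer law, and telephony PCM commonly uses companding instead, so the `pcm_encode` helper is purely illustrative.

```python
SAMPLE_RATE_HZ = 8000   # samples per second, from the text
BITS_PER_SAMPLE = 8     # bits per sample, from the text


def pcm_encode(sample, bits=BITS_PER_SAMPLE):
    """Map a sample in [-1.0, 1.0) to one of 2**bits integer codes.

    Uniform quantization is an assumption made for illustration.
    """
    levels = 1 << bits
    code = int((sample + 1.0) / 2.0 * levels)
    # Clamp so that out-of-range inputs saturate instead of overflowing.
    return max(0, min(levels - 1, code))


# Bit rate is simply samples per second times bits per sample.
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # 64000 bits per second
```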

[0003] In contrast, CVSD does not encode each signal sample approximation, but instead encodes discrete increments of the signal, relative to the previous sample approximation. FIG. 1A illustrates a CVSD digital transmission scheme in accordance with the prior art. As shown in FIG. 1A, analog signal 105 is sampled six times, namely t₀-t₅. The initial value at t₀ is the reference value for the next subsequent sample. Subsequent increases in the value of the signal are encoded as a 1, whereas subsequent decreases in the signal are encoded as a 0. At t₁ the signal has increased from t₀ and therefore a 1 is transmitted. At t₂ the signal has decreased from t₁ and therefore a 0 is transmitted, and so on. CVSD encoding chart 110 illustrates the encoding of signal 105. The signal can then be reconstructed by increasing the value of the reconstructed signal in response to a 1 being received, and decreasing the reconstructed signal in response to a 0 being received. Because CVSD does not transmit each approximate signal sample, but only a relative change in the signal, CVSD requires a significantly lower bit rate than PCM. A typical CVSD technique requires a bit rate of 32 Kbps. In the radio domain, where bandwidth is a concern, CVSD has been preferred over PCM because CVSD provides equivalent speech quality with approximately half the bit rate requirement. Additionally, for radio, a baseline of about 16 Kbps is typically considered sufficient to provide adequate quality, so CVSD provides more than adequate quality.
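The up/down encoding described above may be sketched as follows. This is a simplified delta modulator with a fixed step size; a practical CVSD codec also adapts its step (the "continuously variable slope"), which is omitted here. The function names, the `start` reference value, and the `step` parameter are assumptions made for illustration.

```python
def delta_encode(samples, start=0.0, step=1.0):
    """Emit 1 when the sample is at or above the running approximation, else 0."""
    bits = []
    approx = start  # reference value, playing the role of the value at t0
    for s in samples:
        bit = 1 if s >= approx else 0
        approx += step if bit else -step  # track what the decoder will rebuild
        bits.append(bit)
    return bits


def delta_decode(bits, start=0.0, step=1.0):
    """Rebuild the waveform by stepping up on a 1 and down on a 0."""
    approx = start
    out = []
    for bit in bits:
        approx += step if bit else -step
        out.append(approx)
    return out


samples = [1, 0, 1, 2, 1]          # values at t1..t5, relative to a t0 value of 0
bits = delta_encode(samples)       # [1, 0, 1, 1, 0]
rebuilt = delta_decode(bits)       # [1.0, 0.0, 1.0, 2.0, 1.0]
```

Each sample costs one bit instead of eight, which is where the bit rate saving over PCM comes from, but every reconstructed value depends on all the bits before it.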

[0004] The use of CVSD, however, presents a drawback for systems subject to burst errors. From FIG. 1A, it can be appreciated that CVSD depends heavily on previous data to accurately reconstruct a signal. For burst errors (burst interference), an entire packet of data, or more, may be corrupted at one time. This means that some previous data, which the CVSD scheme depends so heavily upon, may be lost. CVSD speech encoding is subject to error extension and severe loss of speech quality when subject to losses of packets. FIG. 1B provides an illustration of the effect of a burst error on signal recovery using CVSD. Signal 110 suffers a burst error from time t₃-t₆. At time t₆ the reference to the signal has been reestablished. The reconstruction of the signal using CVSD compares the signal at t₃ (0) to the signal at t₆ (−1). Since the value at t₆ is lower, the CVSD scheme sends the reconstructed signal lower. At time t₇ the value of the signal is 0 while the value of the reconstructed signal is −2, so the reconstruction is now totally distorted. The distortion continues at t₈ and t₉, and may continue from one packet to another. Due to gaps, typical in speech signals, there is a tendency for the signal to revert to zero periodically, which eventually ends the error propagation.
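The error extension described above can be reproduced with a small sketch. It assumes a fixed-step delta decoder that simply holds its last value while bits are lost; the bit pattern and the `decode_with_loss` helper are hypothetical and do not reproduce the exact values of FIG. 1B.

```python
def decode_with_loss(bits, lost, start=0.0, step=1.0):
    """Delta-decode bits, holding the last value for indices in `lost`."""
    approx = start
    out = []
    for i, bit in enumerate(bits):
        if i not in lost:
            approx += step if bit else -step
        out.append(approx)
    return out


bits = [1, 1, 1, 0, 0, 0, 1, 0]
clean = decode_with_loss(bits, lost=set())      # [1, 2, 3, 2, 1, 0, 1, 0]
lossy = decode_with_loss(bits, lost={3, 4, 5})  # [1, 2, 3, 3, 3, 3, 4, 3]

# The offset introduced by the burst persists after reception resumes.
errors = [l - c for l, c in zip(lossy, clean)]  # [0, 0, 0, 1, 2, 3, 3, 3]
```

Because the decoder only ever receives relative changes, nothing in the later bits corrects the offset accumulated during the burst.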

[0005] Systems that employ frequency hopping are prone to burst errors. Frequency hopping may be employed where multiple systems are in use in a relatively small area. Each device randomly hops from one frequency to another until a frequency is found that is not in use by some other device at the time. The device may then use the frequency to communicate for a short time before hopping to another available frequency. Thus, the problem of trying to assign a designated frequency to each device in a dynamic (e.g., mobile) environment is avoided. However, because the hopping is random, there are instances where two or more devices have selected the same frequency, causing mutual interference.

[0006] The short-range networking protocol Bluetooth is an example of a frequency-hopping system. Bluetooth hops over a frequency band of 2.402 GHz to 2.48 GHz in 1 MHz increments for a total of 79 channels. The Bluetooth protocol provides for frequency hopping at the rate of 1600 hops per second with 64 bits of data in each hop.
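The channel count and per-hop figures above follow from simple arithmetic, sketched here. The constant names are illustrative, and the derived dwell time and aggregate bit count are only what the quoted numbers imply, not additional protocol facts.

```python
FIRST_CHANNEL_MHZ = 2402    # 2.402 GHz, from the text
LAST_CHANNEL_MHZ = 2480     # 2.48 GHz, from the text
CHANNEL_SPACING_MHZ = 1
HOPS_PER_SECOND = 1600
BITS_PER_HOP = 64

# Both band edges are usable channel centers, hence the + 1.
channels = (LAST_CHANNEL_MHZ - FIRST_CHANNEL_MHZ) // CHANNEL_SPACING_MHZ + 1

# Time spent on each frequency, in microseconds.
hop_duration_us = 1_000_000 / HOPS_PER_SECOND

# Bits carried per second if every hop delivers one 64-bit packet.
bits_per_second = HOPS_PER_SECOND * BITS_PER_HOP
```

At these rates a single collision affects one hop of 625 microseconds, which is why interference appears as the loss of a whole 64-bit packet rather than of isolated bits.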

[0007] Frequency hopping wireless systems operating in a congested RF environment, such as Bluetooth, may address the problem of interference with a data transmission by requesting a retransmission; however, to maintain quality speech transmissions, the delay associated with retransmission must be avoided. Such systems must be able to extrapolate across lost packets.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] Embodiments of the present invention are illustrated by way of example, and not limitation, by the figures of the accompanying drawings in which like references indicate similar elements and in which:

[0009] FIG. 1A illustrates a CVSD digital transmission scheme in accordance with the prior art;

[0010] FIG. 1B provides an illustration of the effect of a burst error on signal recovery using CVSD;

[0011] FIG. 2 illustrates a functional block diagram of the encoder/transmitter for encoding a speech signal in accordance with an embodiment of the present invention;

[0012] FIG. 3 illustrates a functional block diagram of the receiver/decoder for decoding an encoded packet in accordance with an embodiment of the present invention; and

[0013] FIG. 4 is a diagram illustrating an exemplary processing system 400 for implementing an embodiment of the present invention.

DETAILED DESCRIPTION

[0014] An embodiment of the present invention provides a method for accurately establishing a speech signal value subsequent to lost transmission packets. An embodiment of the invention can be implemented via a speech data transmission packet (packet) as used in Bluetooth and other frequency hopping wireless networks. For one embodiment of the present invention, a packet is encoded using an encoding technique wherein a predetermined number of initial bits are encoded using a PCM encoding scheme and the remaining bits are encoded using a CVSD encoding scheme. For the initial bits of each packet, the instantaneous value of the voltage as derived from the CVSD coder/decoder at the transmitter is encoded using PCM coding rather than CVSD coding.

[0015] At the receiver, each packet is decoded independently using the PCM-encoded bits to define a starting value, rather than using the end value of a preceding packet. That is, the PCM-encoded initial bits of each packet are used to define the value of the speech sample at the beginning of each packet, which the subsequent CVSD-encoded bits will reference. If the system experiences a burst error, the PCM-encoded bits of a subsequent, valid packet are used to reestablish the signal value, thus avoiding packet-to-packet error extension.

[0016] An embodiment of the present invention may be implemented via a software algorithm that encodes initial bits of each speech transmission packet as PCM and encodes the remaining bits of the packet as CVSD.

[0017] In the following detailed description of exemplary embodiments of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the described embodiments of the present invention. However, it will be apparent to one skilled in the art that alternative embodiments of the present invention may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the description of exemplary embodiments of the present invention.

[0018] FIG. 2 illustrates a functional block diagram of the encoder/transmitter for encoding a speech signal in accordance with an embodiment of the present invention in which the system employs a 64-bit packet, with four initial bits encoded as PCM.

[0019] The system 200, shown in FIG. 2, provides a clock signal to a CVSD coder/decoder (codec) and a counter at operational block 201. The counter is set for the number of bits in a transmission packet (e.g., 64).

[0020] At operational block 202, the codec converts the input analog signal to digital form (in this case using the CVSD encoding scheme). At transmission the error-free speech signal is available and can be CVSD-encoded without risk of error propagation.

[0021] At operational block 203 the CVSD bits are converted to PCM bits. The signal has been encoded as CVSD on a continuous, error-free basis and, therefore, the waveform can be recreated and the PCM values determined. That is, a CVSD-encoded analog waveform is used to determine the PCM bit values. At operational block 204 PCM bits are stored when the counter is reset.

[0022] PCM bits are used to assemble a data packet when the counter is less than four at operational block 205. That is, the initial four bits of the packet will be PCM-encoded bits. The number of bits encoded as PCM depends upon the packet size and quality requirements of the particular system.

[0023] When the counter is greater than three at operational block 206, the CVSD bits are output to packet assembly. The 64-bit packet is assembled using four initial PCM-encoded bits and 60 CVSD-encoded bits at operational block 207. With such a packet construction, the value of the analog signal at the beginning of each packet may be established using the initial PCM-encoded bits while the remaining CVSD-encoded bits provide the benefit of a relatively low bit rate requirement. For a system using a 64-bit packet, it has been determined empirically that three PCM-encoded bits are sufficient for establishing the value of an analog speech signal.
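The packet assembly of FIG. 2 may be sketched in software as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the signal is normalized to [-1.0, 1.0), the PCM bits come from a uniform 4-bit quantizer, the delta step is fixed at a hypothetical `STEP` value rather than adapted as in true CVSD, and each packet is assumed to carry 60 one-bit samples after its four PCM bits.

```python
PACKET_BITS = 64   # packet size, from the text
PCM_BITS = 4       # initial PCM-encoded bits, from the text
STEP = 0.05        # assumed fixed delta step


def pcm_quantize(value, bits=PCM_BITS, lo=-1.0, hi=1.0):
    """Uniformly quantize value in [lo, hi) and return its code MSB-first."""
    levels = 1 << bits
    code = int((value - lo) / (hi - lo) * levels)
    code = max(0, min(levels - 1, code))
    return [(code >> i) & 1 for i in range(bits - 1, -1, -1)]


def encode_packets(samples, step=STEP):
    """Build packets of 4 PCM bits followed by 60 delta bits."""
    approx = 0.0  # running value of the transmitter-side delta tracker
    per_packet = PACKET_BITS - PCM_BITS
    packets = []
    for start in range(0, len(samples) - per_packet + 1, per_packet):
        # Counter < 4: PCM bits of the tracker's instantaneous value.
        packet = pcm_quantize(approx)
        # Counter > 3: one delta bit per sample.
        for s in samples[start:start + per_packet]:
            bit = 1 if s >= approx else 0
            approx += step if bit else -step
            packet.append(bit)
        packets.append(packet)
    return packets


packets = encode_packets([0.5] * 120)  # two 64-bit packets
```

The tracker runs continuously across packets, as the text describes for the transmitter, so the PCM bits are a snapshot of its state rather than a restart of the encoder.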

[0024] FIG. 3 illustrates a functional block diagram of the receiver/decoder for decoding an encoded packet in accordance with an embodiment of the present invention in which the system employs a 64-bit packet, with four initial bits encoded as PCM. At the receiver each packet is decoded independently using the initial PCM-encoded bits to define the starting value rather than the end value from the preceding packet.

[0025] The system 300, shown in FIG. 3, provides a clock signal to a received packet and a counter at operational block 301. An error detection process (e.g., cyclic redundancy check (CRC)) is performed at operational block 302. In the event that a received packet contains multiple errors, as determined by the CRC, the packet is discarded and the previous packet repeated.
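The discard-and-repeat behavior described above may be sketched as follows. The framing is hypothetical: the text does not specify the CRC width or where the check value is carried, so this sketch appends a 32-bit CRC from Python's standard `zlib.crc32` to each payload purely for illustration.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (assumed framing)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def conceal(frames):
    """Return one payload per frame, repeating the last good one on CRC failure."""
    out = []
    last_good = None
    for f in frames:
        payload, crc = f[:-4], int.from_bytes(f[-4:], "big")
        if zlib.crc32(payload) == crc:
            last_good = payload
        # A bad frame before any good frame produces no output in this sketch.
        if last_good is not None:
            out.append(last_good)
    return out


a = frame(b"\x01" * 8)
b = frame(b"\x02" * 8)
c = frame(b"\x03" * 8)
corrupted = bytes([b[0] ^ 0xFF]) + b[1:]  # simulate interference on frame b
received = conceal([a, corrupted, c])
```

Here the corrupted second frame is replaced by a repeat of the first payload, and the third frame is accepted as received.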

[0026] While the counter is less than four, the packet is decoded as PCM at operational block 303. That is, the first four bits are decoded as PCM and, provided the CRC is positive, the values are stored to a data latch at operational block 304, thus providing the initial value for the digital to analog converter.

[0027] At operational block 305, the remaining bits (i.e., bits 4 through 63) are decoded as CVSD. The CVSD-decoded values are used to increment/decrement the data latch and the data is input to a D/A converter at operational block 306.
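The receiver steps of FIG. 3 may be sketched as follows. As with any software illustration of this flow, several details are assumptions: a uniform 4-bit PCM code over [-1.0, 1.0) decoded to the center of its interval, a fixed delta step in place of adaptive CVSD, and the hypothetical function names `pcm_value` and `decode_packet`.

```python
def pcm_value(bits, lo=-1.0, hi=1.0):
    """Map MSB-first PCM bits to the center of their quantization interval."""
    code = 0
    for b in bits:
        code = (code << 1) | b
    levels = 1 << len(bits)
    return lo + (code + 0.5) * (hi - lo) / levels


def decode_packet(packet, pcm_bits=4, step=0.05):
    """Decode one packet using only its own bits."""
    # Bits 0..3 load the data latch with the packet's starting value.
    latch = pcm_value(packet[:pcm_bits])
    out = []
    # Bits 4..63 increment or decrement the latch.
    for bit in packet[pcm_bits:]:
        latch += step if bit else -step
        out.append(latch)  # value handed to the D/A converter
    return out


packet = [1, 0, 0, 0] + [1] * 60
waveform = decode_packet(packet)
```

Because `decode_packet` keeps no state between calls, a lost or discarded packet cannot shift the starting value of the next valid packet.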

[0028] FIG. 4 is a diagram illustrating an exemplary processing system 400 for implementing an embodiment of the present invention. The encoding and/or decoding of speech signal packets having a number of initial bits PCM-encoded and the remaining bits CVSD-encoded, as described herein, may be implemented and utilized within processing system 400, which may represent a general-purpose computer, portable or mobile computer, or other like device. The components of processing system 400 are exemplary, and one or more components may be omitted or added. For example, one or more memory devices may be utilized for processing system 400.

[0029] Referring to FIG. 4, processing system 400 includes a central processing unit 402 and a signal processor 403 coupled to a display circuit 405, main memory 404, static memory 406, and mass storage device 407 via bus 401. Processing system 400 may also be coupled to a display 421, keypad input 422, cursor control 423, hard copy device 424, input/output (I/O) devices 425, and audio/speech device 426 via bus 401.

[0030] Bus 401 is a standard system bus for communicating information and signals. CPU 402 and signal processor 403 are processing units for processing system 400. CPU 402 or signal processor 403 or both may be used to process information and/or signals for processing system 400. CPU 402 includes a control unit 431, an arithmetic logic unit (ALU) 432, and several registers 433, which are used to process information and signals. Signal processor 403 may also include similar components as CPU 402.

[0031] Main memory 404 may be, e.g., a random access memory (RAM) or some other dynamic storage device, for storing information or instructions (program code), which are used by CPU 402 or signal processor 403. Main memory 404 may store temporary variables or other intermediate information during execution of instructions by CPU 402 or signal processor 403. Static memory 406 may be, e.g., a read only memory (ROM) and/or other static storage devices, for storing information or instructions, which may also be used by CPU 402 or signal processor 403. Mass storage device 407 may be, e.g., a hard or floppy disk drive or optical disk drive, for storing information or instructions for processing system 400.

[0032] Display 421 may be, e.g., a cathode ray tube (CRT) or liquid crystal display (LCD). Display device 421 displays information or graphics to a user. Processing system 400 may interface with display 421 via display circuit 405. Keypad input 422 is an alphanumeric input device with an analog to digital converter. Cursor control 423 may be, e.g., a mouse, a trackball, or cursor direction keys, for controlling movement of an object on display 421. Hard copy device 424 may be, e.g., a laser printer, for printing information on paper, film, or some other like medium. A number of input/output devices 425 may be coupled to processing system 400. The process of encoding a number of bits of a speech transmission packet using PCM encoding and the remaining bits of the packet using CVSD encoding, as well as the process of decoding packets thus encoded, in accordance with one embodiment of the present invention, may be implemented by hardware and/or software contained within processing system 400. For example, CPU 402 or signal processor 403 may execute code or instructions stored in a machine-readable medium, e.g., main memory 404.

[0033] The machine-readable medium may include a mechanism that provides (i.e., stores and/or transmits) information in a form readable by a machine such as a computer or digital processing device. For example, a machine-readable medium may include a read only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, and flash memory devices. The code or instructions may be represented by carrier-wave signals, infrared signals, digital signals, and by other like signals.

[0034] An embodiment of the invention optimally combines distinct encoding techniques to compensate for burst errors without incurring high transmission overhead. Error extension on frequency hopping radio circuits is minimized so that speech is less subject to distortion and noise burst when packets are lost as a result of collisions between frequency hopping radio devices or outside interference.

[0035] In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

What is claimed is:
 1. A method comprising: encoding a first portion of a signal using a first encoding technique; encoding a second portion of the signal using a second encoding technique; creating a transmission packet having a plurality of bits such that a first number of bits represent the encoded first portion of the signal and a second number of bits represent the encoded second portion of the signal.
 2. The method of claim 1, wherein the signal is an analog speech signal and the transmission packet is a speech transmission packet.
 3. The method of claim 2, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 4. The method of claim 3, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 5. The method of claim 3, wherein the speech transmission packet has 64 bits and the first number of bits is three bits.
 6. A method comprising: receiving a transmission packet having a plurality of bits, a first number of bits representing an encoded first portion of a signal encoded using a first encoding technique, and a second number of bits representing an encoded second portion of the signal encoded using a second encoding technique; decoding the first number of bits in accordance with the first encoding technique; and decoding the second number of bits in accordance with the second encoding technique.
 7. The method of claim 6, wherein the signal is an analog speech signal and the transmission packet is a speech transmission packet.
 8. The method of claim 7, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 9. The method of claim 8, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 10. The method of claim 8, wherein the speech transmission packet has 64 bits and the first number of bits is three bits.
 11. A machine-readable medium that provides executable instructions, which when executed by a processor, cause the processor to perform a method comprising: encoding a first portion of a signal using a first encoding technique; encoding a second portion of the signal using a second encoding technique; creating a transmission packet having a plurality of bits such that a first number of bits represent the encoded first portion of the signal and a second number of bits represent the encoded second portion of the signal.
 12. The machine-readable medium of claim 11, wherein the signal is an analog speech signal and the transmission packet is a speech transmission packet.
 13. The machine-readable medium of claim 12, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 14. The machine-readable medium of claim 13, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 15. The method of claim 13, wherein the speech transmission packet has 64 bits and the first number of bits is three bits.
 16. A machine-readable medium that provides executable instructions, which when executed by a processor, cause the processor to perform a method comprising: receiving a transmission packet having a plurality of bits, a first number of bits representing an encoded first portion of a signal encoded using a first encoding technique, and a second number of bits representing an encoded second portion of the signal encoded using a second encoding technique; decoding the first number of bits in accordance with the first encoding technique; and decoding the second number of bits in accordance with the second encoding technique.
 17. The machine-readable medium of claim 16, wherein the signal is an analog speech signal and the transmission packet is a speech transmission packet.
 18. The machine-readable medium of claim 17, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 19. The machine-readable medium of claim 18, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 20. The method of claim 18, wherein the speech transmission packet has 64 bits and the first number of bits is three bits.
 21. An apparatus comprising a processor and a memory coupled thereto, the memory having stored thereon executable instructions which when executed by the processor, cause the processor to encode a first portion of an analog speech signal using a first encoding technique, encode a second portion of the analog speech signal using a second encoding technique, and create a speech transmission packet having a plurality of bits such that a first number of bits represent the encoded first portion of the analog speech signal and a second number of bits represent the encoded second portion of the analog speech signal.
 22. The apparatus of claim 21, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 23. The apparatus of claim 21, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 24. An apparatus comprising a processor and a memory coupled thereto, the memory having stored thereon executable instructions which when executed by the processor, cause the processor to receive a speech transmission packet having a plurality of bits, a first number of bits representing an encoded first portion of an analog speech signal encoded using a first encoding technique, and a second number of bits representing an encoded second portion of the analog speech signal encoded using a second encoding technique, decode the first number of bits in accordance with the first encoding technique, and decode the second number of bits in accordance with the second encoding technique.
 25. The apparatus of claim 24, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 26. The apparatus of claim 24, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 27. A system comprising: a transmitter to transmit a speech transmission packet representing an encoded speech signal, the speech transmission packet containing a plurality of bits, a first number of the plurality of bits corresponding to a first portion of the encoded speech signal and a second number of the plurality of bits corresponding to a second portion of the encoded speech signal, the first portion of the encoded speech signal encoded using a first encoding technique, and the second portion of the encoded speech signal encoded using a second encoding technique; and a receiver to receive the speech transmission packet from the transmitter and decode the first number of bits in accordance with the first encoding technique and decode the second number of bits in accordance with the second encoding technique.
 28. The system of claim 27, wherein the first encoding technique is pulse code modulation and the second encoding technique is continuously variable slope delta modulation.
 29. The system of claim 28, wherein the first number of bits is sufficient to establish the value of the analog speech signal.
 30. The method of claim 28, wherein the speech transmission packet has 64 bits and the first number of bits is three bits.